PROSE: Perceptual Risk Optimization for Speech Enhancement

Authors

  • Jishnu Sadasivan
  • Chandra Sekhar Seelamantula
  • Nagarjuna Reddy Muraka
Abstract

The goal in speech enhancement is to obtain an estimate of clean speech starting from the noisy signal by minimizing a chosen distortion measure, which results in an estimate that depends on the unknown clean signal or its statistics. Since access to such prior knowledge is limited or not possible in practice, one has to estimate the clean signal statistics. In this paper, we develop a new risk minimization framework for speech enhancement, in which one optimizes an unbiased estimate of the distortion/risk instead of the actual risk. The estimated risk is expressed solely as a function of the noisy observations. We consider several perceptually relevant distortion measures and develop corresponding unbiased estimates under realistic assumptions on the noise distribution and a priori signal-to-noise ratio (SNR). Minimizing the risk estimates gives rise to the corresponding denoisers, which are nonlinear functions of the a posteriori SNR. Perceptual evaluation of speech quality (PESQ), average segmental SNR (SSNR) computations, and listening tests show that the proposed risk optimization approach employing Itakura-Saito and weighted hyperbolic cosine distortions gives better performance than the other distortion measures. For SNRs greater than 5 dB, the proposed approach gives superior denoising performance over the benchmark techniques based on the Wiener filter, log-MMSE minimization, and Bayesian nonnegative matrix factorization.

Index Terms: Speech enhancement, perceptual distortion measure, unbiased risk estimation, Stein’s lemma, objective and subjective assessment.
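The idea of optimizing an unbiased risk estimate that depends only on the noisy observations can be illustrated in the simplest setting. The sketch below is not the paper's method: it assumes plain squared-error distortion (not the perceptual measures considered in the abstract), additive zero-mean Gaussian noise of known variance, and a single scalar gain applied to all samples. The function names (`sure_mse`, `optimal_gain`, `true_mse`, `demo`) are hypothetical. Under these assumptions, Stein's lemma gives a risk estimate that needs no access to the clean signal, and its minimizer turns out to be a function of the a posteriori SNR.

```python
import random


def sure_mse(y, a, sigma2):
    """Stein's unbiased estimate of the per-sample MSE of the estimator a*y.

    Assumes y = x + w with w ~ N(0, sigma2); the clean signal x is never used.
    """
    mean_y2 = sum(v * v for v in y) / len(y)
    return (a - 1.0) ** 2 * mean_y2 + (2.0 * a - 1.0) * sigma2


def optimal_gain(y, sigma2):
    """Gain minimizing sure_mse, written via the a posteriori SNR."""
    mean_y2 = sum(v * v for v in y) / len(y)
    post_snr = mean_y2 / sigma2  # noisy power over noise power
    return max(0.0, 1.0 - 1.0 / post_snr)  # clipped to stay nonnegative


def true_mse(x, y, a):
    """Oracle MSE; needs the clean signal, so it is for checking only."""
    return sum((a * yi - xi) ** 2 for xi, yi in zip(x, y)) / len(x)


def demo(seed=0, n=20000, sigma=0.5):
    """Synthetic check: unit-variance Gaussian 'clean' samples plus noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [xi + rng.gauss(0.0, sigma) for xi in x]
    a = optimal_gain(y, sigma ** 2)
    return a, sure_mse(y, a, sigma ** 2), true_mse(x, y, a)


if __name__ == "__main__":
    gain, estimated, actual = demo()
    print(f"gain={gain:.3f}  SURE={estimated:.4f}  true MSE={actual:.4f}")
```

In this toy setup the estimated and oracle risks should agree closely for large sample counts, which is what makes it reasonable to minimize the estimate in place of the unknown true risk.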

Similar resources

A New Shuffled Sub-swarm Particle Swarm Optimization Algorithm for Speech Enhancement

In this paper, we propose a novel algorithm to enhance the noisy speech in the framework of dual-channel speech enhancement. The new method is a hybrid optimization algorithm, which employs the combination of the conventional θ-PSO and the shuffled sub-swarms particle optimization (SSPSO) technique. It is known that the θ-PSO algorithm has better optimization performance than standard PSO al...

Perceptual Factor Analysis for Speech Enhancement

This paper presents a new speech enhancement approach originated from factor analysis (FA) framework. FA is a data analysis model where the relevant common factors can be extracted from observations. A factor loading matrix is found and a resulting model error is introduced for each observation. Interestingly, FA is a subspace approach properly representing the noisy speech. This approach parti...

Speech Enhancement Through an Optimized Subspace Division Technique

The speech enhancement techniques are often employed to improve the quality and intelligibility of the noisy speech signals. This paper discusses a novel technique for speech enhancement which is based on Singular Value Decomposition. This implementation utilizes a Genetic Algorithm based optimization method for reducing the effects of environmental noises from the singular vectors as well as t...

A Heuristic Speech De-noising with the aid of Dual Tree Complex Wavelet Transform using Teaching-Learning Based Optimization

In our present work, we propose a nature-inspired, population-based speech enhancement technique to find the dynamic threshold value using the Teaching-Learning Based Optimization (TLBO) algorithm by using the shift-invariant property of the dual tree complex wavelet transform (DT-CWT). The performance of the proposed methods is evaluated in terms of Perceptual Evaluation of Speech Quality (PES...

Journal:
  • CoRR

Volume: abs/1710.03975

Publication date: 2017